Reducing Type I Errors in Tn-Seq Experiments by Correcting the Skew in Read Count Distributions
Authors
Abstract
Sequencing of transposon-mutant libraries using next-generation sequencing (Tn-Seq) has become a popular method for determining which genes and non-coding regions are essential for growth under various conditions in bacteria. For methods that rely on comparison of read-counts at transposon insertion sites, proper normalization of Tn-Seq datasets is vitally important. Real Tn-Seq datasets often exhibit a significant skew and can be dominated by high counts at a small number of sites (often for nonbiological reasons). If two datasets that are not appropriately normalized are compared, it might cause the artifactual appearance of conditionally essential genes in a statistical test, constituting type I errors (false positives). In this paper, we propose a novel method for normalization of Tn-Seq datasets that corrects for the skew in read count distributions by fitting them to a Beta-Geometric distribution. We show that this read-count correction procedure reduces the number of false positives when comparing replicate datasets grown under the same conditions (for which no genuine differences in essentiality are expected).
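To make the distribution named in the abstract concrete, the following is a minimal sketch of a Beta-Geometric model for non-zero read counts. It assumes the common parameterization in which a geometric success probability is drawn from a Beta(alpha, beta) prior, with support starting at 1. The function names and the coarse grid-search fit are illustrative assumptions, not the authors' procedure, which the abstract does not spell out.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def bg_log_pmf(k, alpha, beta):
    """log P(X = k) for k = 1, 2, ... under Beta-Geometric(alpha, beta).

    Assumed parameterization: P(X = k) = B(alpha + 1, beta + k - 1) / B(alpha, beta).
    """
    if k < 1:
        raise ValueError("this sketch models non-zero read counts only (k >= 1)")
    return log_beta(alpha + 1.0, beta + k - 1.0) - log_beta(alpha, beta)


def bg_log_likelihood(counts, alpha, beta):
    """Sum of log-probabilities over the observed read counts."""
    return sum(bg_log_pmf(k, alpha, beta) for k in counts)


def fit_grid(counts, grid=None):
    """Coarse maximum-likelihood fit by exhaustive grid search (illustrative only)."""
    if grid is None:
        grid = [0.25 * i for i in range(1, 41)]  # 0.25, 0.50, ..., 10.0
    candidates = ((a, b) for a in grid for b in grid)
    return max(candidates, key=lambda ab: bg_log_likelihood(counts, ab[0], ab[1]))


# Hypothetical read counts at insertion sites: mostly small, with two dominant sites.
example_counts = [1, 1, 2, 1, 3, 50, 1, 2, 1, 400]
alpha_hat, beta_hat = fit_grid(example_counts)
```

The sketch stops at the fit. How the fitted distribution is then used to correct the counts is not described in the abstract, so that step is left out here.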
Similar resources
A Family of Skew-Slash Distributions and Estimation of its Parameters via an EM Algorithm
Abstract. In this paper, a family of skew-slash distributions is defined and investigated. We define the new family by the scale mixture of a skew-elliptically distributed random variable with the power of a uniform random variable. This family of distributions contains slash-elliptical and skew-slash distributions. We obtain the moments and some distributional properties of the new family of d...
An Extension of the Birnbaum-Saunders Distribution Based on Skew-Normal t Distribution
In this paper, we introduce a family of univariate Birnbaum-Saunders distributions arising from the skew-normal-t distribution. We obtain several properties of this distribution such as its moments, the maximum likelihood estimation procedure via an EM-algorithm and a method to evaluate standard errors using the EM-algorithm. Finally, we apply these methods to a real data set to demonstr...
EFFECTIVENESS OF INSTRUCT COGNITIVE ERRORS IN THE WAY OF PHILOSOPHY FOR CHILDREN AND ADOLESCENTS, IN COGNITIVE ERRORS, WELL-BEING AND BLOOD SUGAR LEVELS OF CHILDREN AND ADOLESCENTS WITH TYPE I DIABETES
Background: Type 1 diabetes is a chronic disease in which children and adolescents do not have the ability to care for themselves, despite having enough information about their self-care (nutrition, insulin, exercise, etc.). Self-care, like any behavior, can be influenced by the way of thinking, and the philosophy teaching method can be a suitable educational tool for changing thinking. The purp...
Universal Count Correction for High-Throughput Sequencing
We show that existing RNA-seq, DNase-seq, and ChIP-seq data exhibit overdispersed per-base read count distributions that are not matched to existing computational method assumptions. To compensate for this overdispersion we introduce a nonparametric and universal method for processing per-base sequencing read count data called FIXSEQ. We demonstrate that FIXSEQ substantially improves the perfor...
The Family of Scale-Mixture of Skew-Normal Distributions and Its Application in Bayesian Nonlinear Regression Models
In previous studies on fitting nonlinear regression models with a symmetric structure, normality is usually assumed in the analysis of the data. This choice may be inappropriate when the distribution of the residual terms is asymmetric. Recently, the family of scale mixtures of skew-normal distributions has become a main concern of many researchers. This family includes several skewed and heavy-tailed d...
Journal title:
Volume, issue
Pages -
Publication date: 2014